AITopics

Country:

Asia > Singapore (0.04)
Europe > Switzerland (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(19 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Neural Information Processing SystemsFeb-18-2026, 04:02:11 GMT

Auslan-Daily: Australian Sign Language Translation for Daily Communication and News

Considering different geographic regions generally have their own native sign languages, it is valuable to establish corresponding SL T datasets to support related communication and research. Auslan, as a sign language specific to Australia, still lacks a dedicated large-scale dataset for SL T.

artificial intelligence, machine learning, natural language, (15 more...)

Country:

Asia > Philippines > Luzon > National Capital Region > City of Manila (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(22 more...)

Industry:

Education > Curriculum > Subject-Specific Education (0.96)
Health & Medicine (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Neural Information Processing SystemsFeb-17-2026, 23:22:30 GMT

Improving Gloss-free Sign Language Translation by Reducing Representation Density

Sign languages are the primary form of communication for millions of deaf individuals.

machine learning, natural language, signcl, (18 more...)

Country:

Asia > China > Guangdong Province > Guangzhou (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Croatia > Dubrovnik-Neretva County > Dubrovnik (0.04)
(3 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Education > Curriculum > Subject-Specific Education (0.68)
Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Neural Information Processing SystemsFeb-12-2026, 08:39:04 GMT

5c61452daca5f0c260e683b317d13a3f-Paper-Datasets_and_Benchmarks.pdf

artificial intelligence, machine learning, natural language, (16 more...)

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > Dominican Republic (0.04)
(5 more...)

Industry:

Law (0.67)
Information Technology (0.67)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.68)

Neural Information Processing SystemsFeb-9-2026, 15:45:06 GMT

6cd3ac24cdb789beeaa9f7145670fcae-Paper-Conference.pdf

language recognition, recognition, sign language recognition, (15 more...)

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.71)

Neural Information Processing SystemsDec-24-2025, 06:47:36 GMT

TSPNet: Hierarchical Feature Learning via Temporal Semantic Pyramid for Sign Language Translation

Sign language translation (SLT) aims to interpret sign video sequences into text-based natural language sentences. Sign videos consist of continuous sequences of sign gestures with no clear boundaries in between. Existing SLT models usually represent sign visual features in a frame-wise manner so as to avoid needing to explicitly segmenting the videos into isolated signs. However, these methods neglect the temporal information of signs and lead to substantial ambiguity in translation. In this paper, we explore the temporal semantic structures of sign videos to learn more discriminative features. To this end, we first present a novel sign video segment representation which takes into account multiple temporal granularities, thus alleviating the need for accurate video segmentation. Taking advantage of the proposed segment representation, we develop a novel hierarchical sign video feature learning method via a temporal semantic pyramid network, called TSPNet. Specifically, TSPNet introduces an inter-scale attention to evaluate and enhance local semantic consistency of sign segments and an intra-scale attention to resolve semantic ambiguity by using non-local video context. Experiments show that our TSPNet outperforms the state-of-the-art with significant improvements on the BLEU score (from 9.58 to 13.41) and ROUGE score (from 31.80 to 34.96) on the largest commonly used SLT dataset.

hierarchical feature learning, temporal semantic pyramid, tspnet, (8 more...)

Industry: Education > Curriculum > Subject-Specific Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.79)
Information Technology > Artificial Intelligence > Natural Language (0.62)

Johnny, Samuel Ebimobowei, Guda, Blessed, Aaron, Emmanuel Enejo, Gueye, Assane

Pose-Based Sign Language Spotting via an End-to-End Encoder Architecture

arXiv.org Artificial IntelligenceDec-10-2025

Automatic Sign Language Recognition (ASLR) has emerged as a vital field for bridging the gap between deaf and hearing communities. However, the problem of sign-to-sign retrieval or detecting a specific sign within a sequence of continuous signs remains largely unexplored. We define this novel task as Sign Language Spotting. In this paper, we present a first step toward sign language retrieval by addressing the challenge of detecting the presence or absence of a query sign video within a sentence-level gloss or sign video. Unlike conventional approaches that rely on intermediate gloss recognition or text-based matching, we propose an end-to-end model that directly operates on pose keypoints extracted from sign videos. Our architecture employs an encoder-only backbone with a binary classification head to determine whether the query sign appears within the target sequence. By focusing on pose representations instead of raw RGB frames, our method significantly reduces computational cost and mitigates visual noise. We evaluate our approach on the Word Presence Prediction dataset from the WSLP 2025 shared task, achieving 61.88\% accuracy and 60.00\% F1-score. These results demonstrate the effectiveness of our pose-based framework for Sign Language Spotting, establishing a strong foundation for future research in automatic sign language retrieval and verification. Code is available at https://github.com/EbimoJohnny/Pose-Based-Sign-Language-Spotting

artificial intelligence, machine learning, natural language, (17 more...)

2512.08738

Country:

North America > United States (0.15)
Europe > Spain (0.14)
Africa > Rwanda (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Thomas, Marshall, Fish, Edward, Bowden, Richard

SignBind-LLM: Multi-Stage Modality Fusion for Sign Language Translation

arXiv.org Artificial IntelligenceDec-5-2025

Despite progress in gloss-free Sign Language Translation (SLT), traditional single modality end-to-end approaches consistently fail on two critical components of natural signing: the precise recognition of high-speed fingerspelling and the integration of asynchronous non-manual cues from the face. Recent progress in SLT with Large Language Models has side stepped this challenge, forcing a single network to learn these simultaneously resulting in poor performance when tasked with translating crucial information such as names, places, and technical terms. We introduce SignBind-LLM, a modular framework designed to overcome these limitations. Our approach employs separate, specialized predictors for continuous signing, fingerspelling, and lipreading. Each expert network first decodes its specific modality into a sequence of tokens. These parallel streams are then fused by a lightweight transformer that resolves temporal misalignments before passing the combined representation to a Large Language Model (LLM) for final sentence generation. Our method establishes a new state-of-the-art on the How2Sign, ChicagoFSWildPlus, and BOBSL datasets with a BLEU-4 score of 22.1, 73.2% letter accuracy and BLEU-4 score of 6.8 respectively. These results validate our core hypothesis: isolating and solving distinct recognition tasks before fusion provides a more powerful and effective pathway to robust, high-fidelity sign language translation.

large language model, machine learning, translation, (18 more...)

2509.0003

Country: Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Education > Curriculum > Subject-Specific Education (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Rubaiyeat, Husne Ara, Mahmud, Hasan, Hasan, Md Kamrul

Bangla Sign Language Translation: Dataset Creation Challenges, Benchmarking and Prospects

arXiv.org Artificial IntelligenceNov-27-2025

Bangla Sign Language Translation (BdSLT) has been severely constrained so far as the language itself is very low resource. Standard sentence level dataset creation for BdSLT is of immense importance for developing AI based assistive tools for deaf and hard of hearing people of Bangla speaking community. In this paper, we present a dataset, IsharaKhobor , and two subset of it for enabling research. We also present the challenges towards developing the dataset and present some way forward by benchmarking with landmark based raw and RQE embedding. We do some ablation on vocabulary restriction and canonicalization of the same within the dataset, which resulted in two more datasets, IsharaKhobor_small and IsharaKhobor_canonical_small. The dataset is publicly available at: www.kaggle.com/datasets/hasanssl/isharakhobor [1].

artificial intelligence, machine learning, natural language, (16 more...)

2511.21533

Genre: Research Report (0.51)

Industry: Education > Curriculum > Subject-Specific Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Pu, Muxin, Lim, Mei Kuan, Chong, Chun Yong, Loy, Chen Change

Sigma: Semantically Informative Pre-training for Skeleton-based Sign Language Understanding

arXiv.org Artificial IntelligenceNov-21-2025

Pre-training has proven effective for learning transferable features in sign language understanding (SLU) tasks. Recently, skeleton-based methods have gained increasing attention because they can robustly handle variations in subjects and backgrounds without being affected by appearance or environmental factors. Current SLU methods continue to face three key limitations: 1) weak semantic grounding, as models often capture low-level motion patterns from skeletal data but struggle to relate them to linguistic meaning; 2) imbalance between local details and global context, with models either focusing too narrowly on fine-grained cues or overlooking them for broader context; and 3) inefficient cross-modal learning, as constructing semantically aligned representations across modalities remains difficult. To address these, we propose Sigma, a unified skeleton-based SLU framework featuring: 1) a sign-aware early fusion mechanism that facilitates deep interaction between visual and textual modalities, enriching visual features with linguistic context; 2) a hierarchical alignment learning strategy that jointly maximises agreements across different levels of paired features from different modalities, effectively capturing both fine-grained details and high-level semantic relationships; and 3) a unified pre-training framework that combines contrastive learning, text matching and language modelling to promote semantic consistency and generalisation. Sigma achieves new state-of-the-art results on isolated sign language recognition, continuous sign language recognition, and gloss-free sign language translation on multiple benchmarks spanning different sign and spoken languages, demonstrating the impact of semantically informative pre-training and the effectiveness of skeletal data as a stand-alone solution for SLU.

artificial intelligence, machine learning, natural language, (16 more...)

2509.21223

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)